Results 1 - 20 of 30
1.
Transbound Emerg Dis ; 69(5): e2443-e2455, 2022 Sep.
Article in English | MEDLINE | ID: covidwho-2053020

ABSTRACT

The porcine deltacoronavirus (PDCoV) is a newly discovered pig enteric coronavirus that can infect cells from various species. In Haiti, PDCoV infections in children with acute undifferentiated febrile illness were recently reported. Considering the great potential for inter-species transmission of PDCoV, we performed a comprehensive analysis of codon usage patterns and host adaptation profiles of the spike (S) gene of 54 representative PDCoV strains. Phylogenetic analysis of the PDCoV S gene indicates that the PDCoV strains can be divided into five genogroups. We found that a certain codon usage bias exists in the S gene, in which the synonymous codons often end with U or A. Heat map analysis revealed that all the PDCoV strains shared a similar codon usage trend. A dN/dS ratio lower than 1 indicates negative selection on the PDCoV S gene. Neutrality analysis showed that natural selection is the dominant force in shaping the codon usage bias of the PDCoV S gene. Unexpectedly, host adaptation analysis reveals a higher adaptation level of PDCoV to Homo sapiens and Gallus gallus than to Sus scrofa. Compared to the USA lineage, the PDCoV strains in the Early China lineage and Thailand lineage were less adapted to their hosts, which indicates that the evolutionary process plays an important role in the adaptation ability of PDCoV. The findings of this study add to our understanding of PDCoV's evolution, adaptability, and inter-species transmission.


Subject(s)
Coronavirus Infections , Swine Diseases , Animals , Codon/genetics , Codon Usage , Coronavirus Infections/epidemiology , Coronavirus Infections/veterinary , Deltacoronavirus , Genome, Viral/genetics , Phylogeny , Swine , Swine Diseases/epidemiology
2.
Virology ; 568: 56-71, 2022 03.
Article in English | MEDLINE | ID: covidwho-1665518

ABSTRACT

SARS-CoV-2, the seventh coronavirus known to infect humans, can cause severe life-threatening respiratory pathologies. To better understand SARS-CoV-2 evolution, genome-wide analyses have been made, including the general characterization of its codon usage profile. Here we present a bioinformatic analysis of the evolution of SARS-CoV-2 codon usage over time using complete genomes collected since December 2019. Our results show that the SARS-CoV-2 codon usage pattern is antagonistic to that of the human host and is moving farther away from it. Further, a selection of deoptimized codons over time, accompanied by a decrease in both the codon adaptation index and the effective number of codons, was observed. Altogether, these findings suggest that SARS-CoV-2 could be evolving, at least from the perspective of synonymous codon usage, to become less pathogenic.


Subject(s)
COVID-19/epidemiology , COVID-19/virology , Codon Usage , Codon , Evolution, Molecular , Pandemics , SARS-CoV-2/genetics , Betacoronavirus/classification , Betacoronavirus/genetics , Gene Expression Regulation, Viral , Genome, Viral , Genomics/methods , Humans , Open Reading Frames , Organ Specificity , Phylogeny
3.
Genes Genet Syst ; 96(4): 165-176, 2021 Dec 16.
Article in English | MEDLINE | ID: covidwho-1574597

ABSTRACT

In genetics and related fields, huge amounts of data, such as genome sequences, are accumulating, and the use of artificial intelligence (AI) suitable for big data analysis has become increasingly important. Unsupervised AI that can reveal novel knowledge from big data without prior knowledge or particular models is highly desirable for analyses of genome sequences, particularly for obtaining unexpected insights. We have developed a batch-learning self-organizing map (BLSOM) for oligonucleotide compositions that can reveal various novel genome characteristics. Here, we explain the data mining by the BLSOM: an unsupervised AI. As a specific target, we first selected SARS-CoV-2 (severe acute respiratory syndrome coronavirus 2) because a large number of viral genome sequences have been accumulated via worldwide efforts. We analyzed more than 0.6 million sequences collected primarily in the first year of the pandemic. BLSOMs for short oligonucleotides (e.g., 4-6-mers) allowed separation into known clades, but longer oligonucleotides further increased the separation ability and revealed subgrouping within known clades. In the case of 15-mers, there is mostly one copy in the genome; thus, 15-mers that appeared after the epidemic started could be connected to mutations, and the BLSOM for 15-mers revealed the mutations that contributed to separation into known clades and their subgroups. After introducing the detailed methodological strategies, we explain BLSOMs for various topics, such as the tetranucleotide BLSOM for over 5 million 5-kb fragment sequences derived from almost all microorganisms currently available and its use in metagenome studies. We also explain BLSOMs for various eukaryotes, including fishes, frogs and Drosophila species, and found a high separation ability among closely related species. When analyzing the human genome, we found enrichments in transcription factor-binding sequences in centromeric and pericentromeric heterochromatin regions. 
The tDNAs (tRNA genes) could be separated according to their corresponding amino acid.
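
The oligonucleotide compositions that serve as BLSOM input can be illustrated with a minimal Python sketch. It only counts k-mer frequencies in one sequence; it is not the authors' BLSOM code, and the function name `kmer_frequencies` and the toy sequence are assumptions made for illustration.

```python
from collections import Counter


def kmer_frequencies(sequence, k=4):
    """Return the relative frequency of every k-mer observed in `sequence`."""
    seq = sequence.upper().replace("U", "T")
    # Slide a window of width k along the sequence, one base at a time.
    windows = [seq[i:i + k] for i in range(len(seq) - k + 1)]
    # Skip windows that contain ambiguous bases such as N.
    counts = Counter(w for w in windows if set(w) <= set("ACGT"))
    total = sum(counts.values())
    return {kmer: n / total for kmer, n in counts.items()} if total else {}


# Toy input (hypothetical): 5 overlapping tetranucleotides, "ACGT" occurs twice.
freqs = kmer_frequencies("ACGTACGT", k=4)
print(freqs)
```

A vector of such frequencies per genome or per fragment is the kind of input a self-organizing map would cluster; the map itself is outside the scope of this sketch.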


Subject(s)
Artificial Intelligence , Computational Biology/methods , Genome, Human , Genome, Viral , SARS-CoV-2/genetics , Cluster Analysis , Codon Usage , Humans , Metagenomics/methods , Mutation , RNA, Transfer , Time Factors
4.
Viruses ; 13(9)2021 09 16.
Article in English | MEDLINE | ID: covidwho-1411085

ABSTRACT

Many viruses that cause serious diseases in humans and animals, including the betacoronaviruses (beta-CoVs), such as SARS-CoV, MERS-CoV, and the recently identified SARS-CoV-2, have natural reservoirs in bats. Because these viruses rely entirely on the host cellular machinery for survival, their evolution is likely to be guided by the link between the codon usage of the virus and that of its host. As a result, specific cellular microenvironments of the diverse hosts and/or host tissues imprint peculiar molecular signatures in virus genomes. Our study is aimed at deciphering some of these signatures. Using a variety of genetic methods we demonstrated that trends in codon usage across chiroptera-hosted CoVs are collaboratively driven by geographically different host-species and temporal-spatial distribution. We not only found that chiroptera-hosted CoVs are the ancestors of SARS-CoV-2, but we also revealed that SARS-CoV-2 has the codon usage characteristics similar to those seen in CoVs infecting the Rhinolophus sp. Surprisingly, the envelope gene of beta-CoVs infecting Rhinolophus sp., including SARS-CoV-2, had extremely high CpG levels, which appears to be an evolutionarily conserved trait. The dissection of the furin cleavage site of various CoVs infecting hosts revealed host-specific preferences for arginine codons; however, arginine is encoded by a wider variety of synonymous codons in the murine CoV (MHV-A59) furin cleavage site. Our findings also highlight the latent diversity of CoVs in mammals that has yet to be fully explored.


Subject(s)
Chiroptera/virology , Codon Usage , Coronavirus/genetics , Evolution, Molecular , Animals , Furin/metabolism , Genetic Variation , Genome, Viral
5.
Viruses ; 13(9)2021 09 10.
Article in English | MEDLINE | ID: covidwho-1411076

ABSTRACT

The Severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) is the third human-emerged virus of the 21st century from the Coronaviridae family, causing the ongoing coronavirus disease 2019 (COVID-19) pandemic. Due to the high zoonotic potential of coronaviruses, it is critical to unravel their evolutionary history of host species breadth, host-switch potential, adaptation and emergence, to identify viruses posing a pandemic risk in humans. We present here a comprehensive analysis of the composition and codon usage bias of the 82 Orthocoronavirinae members, infecting 47 different avian and mammalian hosts. Our results clearly establish that synonymous codon usage varies widely among viruses, is only weakly dependent on their primary host, and is dominated by mutational bias towards AU-enrichment and by CpG avoidance. Indeed, variation in GC3 explains around 34%, while variation in CpG frequency explains around 14% of total variation in codon usage bias. Further insight on the mutational equilibrium within Orthocoronavirinae revealed that most coronavirus genomes are close to their neutral equilibrium, the exception being the three recently infecting human coronaviruses, which lie further away from the mutational equilibrium than their endemic human coronavirus counterparts. Finally, our results suggest that, while replicating in humans, SARS-CoV-2 is slowly becoming AU-richer, likely until attaining a new mutational equilibrium.
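
The two compositional quantities named in this abstract, GC3 and CpG frequency, can be sketched in Python as follows. This is a minimal illustration, not the study's pipeline: the function names are assumptions, and the CpG measure shown is the common observed/expected ratio, which may differ from the exact statistic the authors used.

```python
def gc3(cds):
    """Fraction of G or C at third codon positions of an in-frame coding sequence."""
    seq = cds.upper().replace("U", "T")
    usable = len(seq) - len(seq) % 3      # ignore a trailing partial codon
    third = seq[2:usable:3]               # every third base, starting at position 3
    return sum(base in "GC" for base in third) / len(third) if third else 0.0


def cpg_obs_exp(sequence):
    """Observed/expected CpG ratio: f(CG) / (f(C) * f(G))."""
    seq = sequence.upper().replace("U", "T")
    n = len(seq)
    if n < 2:
        return 0.0
    f_c = seq.count("C") / n
    f_g = seq.count("G") / n
    if f_c == 0 or f_g == 0:
        return 0.0
    f_cg = seq.count("CG") / (n - 1)      # n - 1 dinucleotide positions
    return f_cg / (f_c * f_g)


# Toy inputs (hypothetical): codons ATG GCC AAA have third bases G, C, A.
print(gc3("ATGGCCAAA"))
print(cpg_obs_exp("ACGT"))
```

A ratio well below 1 from `cpg_obs_exp` would indicate CpG avoidance of the kind the abstract describes.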


Subject(s)
COVID-19/epidemiology , COVID-19/virology , Codon Usage , Genome, Viral , Mutation , SARS-CoV-2/genetics , Selection, Genetic , Evolution, Molecular , Host-Pathogen Interactions/genetics , Humans , Pandemics
6.
Brief Bioinform ; 22(2): 1006-1022, 2021 03 22.
Article in English | MEDLINE | ID: covidwho-1387712

ABSTRACT

Interaction of the SARS-CoV-2 spike glycoprotein with the ACE2 cell receptor is crucial for virus attachment to human cells. Selected mutations in the SARS-CoV-2 S-protein are reported to strengthen its binding affinity to mammalian ACE2. The N501T mutation in SARS-CoV-2-CTD furnishes better support to hotspot 353 in comparison with SARS-CoV and shows higher affinity for receptor binding. Recombination analysis revealed more recombination events in SARS-CoV-2 strains, irrespective of their geographical origin or hosts. Investigation further supports a common origin among SARS-CoV-2 and its predecessors, SARS-CoV and bat-SARS-like-CoV. The recombination events suggest a constant exchange of genetic material among the co-infecting viruses in possible reservoirs and human hosts before SARS-CoV-2 emerged. Furthermore, a comprehensive analysis of codon usage bias (CUB) in SARS-CoV-2 revealed significant CUB among the S-genes of different beta-coronaviruses, governed mainly by natural selection and mutation pressure. Various indices of codon usage of S-genes helped in quantifying its adaptability in other animal hosts. These findings might help in identifying potential experimental animal models for investigating pathogenicity in drug and vaccine development experiments.


Subject(s)
Biological Evolution , Codon Usage , SARS-CoV-2/genetics , Spike Glycoprotein, Coronavirus/genetics , Angiotensin-Converting Enzyme 2/metabolism , Animals , Humans , Models, Animal , Mutation , RNA, Transfer/genetics , Spike Glycoprotein, Coronavirus/metabolism
7.
Genome Biol Evol ; 13(10)2021 10 01.
Article in English | MEDLINE | ID: covidwho-1370777

ABSTRACT

Owing to a lag between a deleterious mutation's appearance and its selective removal, gold-standard methods for mutation rate estimation assume no meaningful loss of mutations between parents and offspring. Indeed, from analysis of closely related lineages, in SARS-CoV-2, the Ka/Ks ratio was previously estimated as 1.008, suggesting no within-host selection. By contrast, we find a higher number of observed SNPs at 4-fold degenerate sites than elsewhere and, allowing for the virus's complex mutational and compositional biases, estimate that the mutation rate is at least 49-67% higher than would be estimated based on the rate of appearance of variants in sampled genomes. Given the high Ka/Ks one might assume that the majority of such intrahost selection is the purging of nonsense mutations. However, we estimate that selection against nonsense mutations accounts for only ∼10% of all the "missing" mutations. Instead, classical protein-level selective filters (against chemically disparate amino acids and those predicted to disrupt protein functionality) account for many missing mutations. It is less obvious why for an intracellular parasite, amino acid cost parameters, notably amino acid decay rate, is also significant. Perhaps most surprisingly, we also find evidence for real-time selection against synonymous mutations that move codon usage away from that of humans. We conclude that there is common intrahost selection on SARS-CoV-2 that acts on nonsense, missense, and possibly synonymous mutations. This has implications for methods of mutation rate estimation, for determining times to common ancestry and the potential for intrahost evolution including vaccine escape.


Subject(s)
COVID-19/virology , Mutation , SARS-CoV-2/genetics , Codon Usage , Codon, Nonsense , Evolution, Molecular , Humans , Models, Genetic , Mutation Rate , Mutation, Missense , Polymorphism, Single Nucleotide , Selection, Genetic , Silent Mutation
8.
J Med Virol ; 93(9): 5630-5634, 2021 09.
Article in English | MEDLINE | ID: covidwho-1363678

ABSTRACT

Since the start of the coronavirus disease 2019 (COVID-19) pandemic, the severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) has rapidly spread worldwide, becoming one of the major global public health issues of recent centuries. COVID-19 vaccine rollouts are now under way, carrying the hope of herd immunity once a sufficient proportion of the population has been vaccinated or infected. However, the emergence of SARS-CoV-2 variants has raised concerns since, as the virus is exposed to environmental selection pressures, it can mutate and evolve, generating variants that may possess enhanced virulence. Codon usage analysis is a strategy to elucidate the evolutionary pressure exerted on the viral genome by different hosts, as a possible cause of the emergence of new variants. Therefore, to get a better picture of the SARS-CoV-2 codon bias, we first identified the relative codon usage rate of all Betacoronavirus lineages. Subsequently, we correlated putative cognate transfer ribonucleic acids (tRNAs) to reveal how those viruses adapt to hosts in relation to their preferred codon usage. Our analysis revealed seven preferred codons, located in three different open reading frames, which appear to be preferentially used by SARS-CoV-2. In addition, the tRNA adaptation analysis indicates a broad strategy of competition between the virus and mammals as principal hosts, highlighting the importance of reinforcing genomic monitoring to promptly identify any potential adaptation of the virus to new hosts, which appears to be crucial to prevent and mitigate the pandemic.


Subject(s)
Betacoronavirus/genetics , Codon Usage , Coronavirus Infections/virology , Genome, Viral , Mammals , SARS-CoV-2/genetics , Animals , COVID-19 , COVID-19 Vaccines , Codon , Host-Pathogen Interactions , Humans , Mutation , Open Reading Frames , Phylogeny , RNA, Transfer
9.
Virology ; 562: 149-157, 2021 10.
Article in English | MEDLINE | ID: covidwho-1331287

ABSTRACT

Six candidate overlapping genes have been detected in SARS-CoV-2, yet current methods struggle to detect overlapping genes that recently originated. However, such genes might encode proteins beneficial to the virus, and provide a model system to understand gene birth. To complement existing detection methods, I first demonstrated that selection pressure to avoid stop codons in alternative reading frames is a driving force in the origin and retention of overlapping genes. I then built a detection method, CodScr, based on this selection pressure. Finally, I combined CodScr with methods that detect other properties of overlapping genes, such as a biased nucleotide and amino acid composition. I detected two novel ORFs (ORF-Sh and ORF-Mh), overlapping the spike and membrane genes respectively, which are under selection pressure and may be beneficial to SARS-CoV-2. ORF-Sh and ORF-Mh are present, as ORF uninterrupted by stop codons, in 100% and 95% of the SARS-CoV-2 genomes, respectively.
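
The underlying signal, stop codons in alternative reading frames, can be illustrated with a short Python sketch. This is not CodScr and does not reproduce its scoring; it only counts stop codons in the three forward frames of a sequence, and the function name and toy sequences are assumptions.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def stop_counts_by_frame(sequence):
    """Count stop codons in each of the three forward reading frames."""
    seq = sequence.upper().replace("U", "T")
    counts = []
    for frame in range(3):
        # Step through complete codons only, starting at the frame offset.
        codons = (seq[i:i + 3] for i in range(frame, len(seq) - 2, 3))
        counts.append(sum(codon in STOP_CODONS for codon in codons))
    return counts


# Toy inputs (hypothetical).
print(stop_counts_by_frame("ATGTAAGGC"))  # stop only in the annotated frame
print(stop_counts_by_frame("ATGATAGCC"))  # stops only in the +1 frame
```

A frame with zero stops over a long stretch is a candidate open reading frame; whether that absence is unusual requires a null model, which this sketch does not attempt.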


Subject(s)
Codon Usage , Genes, Overlapping , Open Reading Frames , SARS-CoV-2/genetics , Evolution, Molecular , Genome, Viral , Spike Glycoprotein, Coronavirus/chemistry , Spike Glycoprotein, Coronavirus/genetics , Statistics as Topic
10.
Genes Genomics ; 43(11): 1351-1359, 2021 11.
Article in English | MEDLINE | ID: covidwho-1296973

ABSTRACT

BACKGROUND: COVID-19, a novel coronavirus disease caused by the new coronavirus SARS-CoV-2, has spread all over the world and brought harm to people in many countries. Humans have suffered greatly both from SARS-CoV-2 now and from SARS-CoV in 2003. It is important to understand the differences and the relationships between these two viruses. OBJECTIVE: To compare the relative synonymous codon usage of the ORF1ab gene in SARS-CoV-2 and SARS-CoV, the relative synonymous codon usage of their genomes is studied in this paper from a bioinformatics perspective. METHODS: The ORF1ab gene, an important non-structural polyprotein coding gene that is now used as a nucleic acid detection marker in many measurement methods, is taken as the research object in both SARS-CoV-2 (30 strains) and SARS-CoV (20 strains). The relative synonymous codon usage values of the ORF1ab gene are calculated to characterize the differences and the evolutionary characteristics among the 50 strains. RESULTS: There is a significant difference between SARS-CoV and SARS-CoV-2 in the relative synonymous codon usage values of the ORF1ab gene. The results suggest that the codon usage pattern of SARS-CoV is more similar to that of humans than the pattern of SARS-CoV-2 is, and that the variation among SARS-CoV-2 strains is larger than that among SARS-CoV strains, which indicates that greater diversity exists in SARS-CoV-2. CONCLUSION: These results show that relative synonymous codon usage values in coronaviruses could be used for further research on their evolution.
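
Relative synonymous codon usage (RSCU) is the observed count of a codon divided by the count expected if all synonymous codons for that amino acid were used equally. A minimal Python sketch follows, assuming the standard genetic code; the function name `rscu` and the toy sequence are illustrative and are not taken from the paper.

```python
from collections import Counter
from itertools import product

# Standard genetic code, codons enumerated in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def rscu(cds):
    """Return {codon: RSCU} for amino acids that occur and have >1 synonymous codon."""
    seq = cds.upper().replace("U", "T")
    codons = [seq[i:i + 3] for i in range(0, len(seq) - 2, 3)]
    counts = Counter(c for c in codons if c in CODON_TABLE)

    # Group codons into synonymous families by amino acid.
    families = {}
    for codon, aa in CODON_TABLE.items():
        families.setdefault(aa, []).append(codon)

    values = {}
    for aa, family in families.items():
        total = sum(counts[c] for c in family)
        # Skip stop codons, Met and Trp (single codon), and absent amino acids.
        if aa == "*" or len(family) == 1 or total == 0:
            continue
        for codon in family:
            values[codon] = counts[codon] * len(family) / total
    return values


# Toy input (hypothetical): three alanine codons, GCT twice and GCC once.
values = rscu("GCTGCTGCC")
print(values)
```

An RSCU above 1 marks a codon used more often than expected under equal usage, and a value below 1 marks one used less often; the values within one family sum to the family size.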


Subject(s)
Codon Usage/genetics , Polyproteins/genetics , SARS-CoV-2/genetics , Severe acute respiratory syndrome-related coronavirus/genetics , Viral Proteins/genetics , COVID-19 , Computational Biology , Evolution, Molecular , Genome, Viral , Humans , Open Reading Frames , Phylogeny , SARS-CoV-2/classification
11.
Int J Mol Sci ; 22(12)2021 Jun 17.
Article in English | MEDLINE | ID: covidwho-1273459

ABSTRACT

The SARS-CoV-2 Spike glycoprotein (S protein) acquired a unique new 4 amino acid -PRRA- insertion sequence at amino acid residues (aa) 681-684 that forms a new furin cleavage site in S protein as well as several new glycosylation sites. We studied various statistical properties of the -PRRA- insertion at the RNA level (CCUCGGCGGGCA). The nucleotide composition and codon usage of this sequence are different from the rest of the SARS-CoV-2 genome. One of such features is two tandem CGG codons, although the CGG codon is the rarest codon in the SARS-CoV-2 genome. This suggests that the insertion sequence could cause ribosome pausing as the result of these rare codons. Due to population variants, the Nextstrain divergence measure of the CCU codon is extremely large. We cannot exclude that this divergence might affect host immune responses/effectiveness of SARS-CoV-2 vaccines, possibilities awaiting further investigation. Our experimental studies show that the expression level of original RNA sequence "wildtype" spike protein is much lower than for codon-optimized spike protein in all studied cell lines. Interestingly, the original spike sequence produces a higher titer of pseudoviral particles and a higher level of infection. Further mutagenesis experiments suggest that this dual-effect insert, comprised of a combination of overlapping translation pausing and furin sites, has allowed SARS-CoV-2 to infect its new host (human) more readily. This underlines the importance of ribosome pausing to allow efficient regulation of protein expression and also of cotranslational subdomain folding.
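
The codon structure of the insertion quoted above can be checked with a few lines of Python. The sketch assumes the standard genetic code and reads the 12-nucleotide RNA sequence in frame from its first base; the variable names are illustrative.

```python
from itertools import product

# Standard genetic code for RNA, codons enumerated in UCAG order.
BASES = "UCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

insert_rna = "CCUCGGCGGGCA"  # sequence as given in the abstract

# Split into codons and translate.
codons = [insert_rna[i:i + 3] for i in range(0, len(insert_rna), 3)]
peptide = "".join(CODON_TABLE[c] for c in codons)

# Detect two adjacent CGG codons.
tandem_cgg = any(a == b == "CGG" for a, b in zip(codons, codons[1:]))

print(codons, peptide, tandem_cgg)
```

The split confirms that both arginines of -PRRA- are encoded by CGG, which is the tandem pair the abstract refers to.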


Subject(s)
RNA, Viral/metabolism , Ribosomes/metabolism , SARS-CoV-2/metabolism , Spike Glycoprotein, Coronavirus/genetics , Animals , Base Sequence , COS Cells , COVID-19/pathology , COVID-19/virology , Chlorocebus aethiops , Codon Usage , HEK293 Cells , Humans , Mutagenesis , SARS-CoV-2/isolation & purification , Sequence Alignment , Spike Glycoprotein, Coronavirus/metabolism
12.
Biomolecules ; 11(6)2021 06 18.
Article in English | MEDLINE | ID: covidwho-1273388

ABSTRACT

The ongoing outbreak of coronavirus disease COVID-19 is significantly implicated by global heterogeneity in the genome organization of severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2). The causative agents of global heterogeneity in the whole genome of SARS-CoV-2 are not well characterized due to the lack of a comparative study with a sample from around the globe large enough to reduce the standard deviation to an acceptable margin of error. To better understand the SARS-CoV-2 genome architecture, we have performed a comprehensive analysis of codon usage bias of sixty (60) strains to get a snapshot of its global heterogeneity. Our study shows a relatively low codon usage bias in the SARS-CoV-2 viral genome globally, with nearly all the over-preferred codons A/U-ended. We concluded that the SARS-CoV-2 genome is primarily shaped by mutation pressure; however, marginal selection pressure cannot be overlooked. Within the A/U-rich genomes of SARS-CoV-2, the standard deviation in GC content (42.91% ± 5.84%) and the GC3 value (30.14% ± 6.93%) points towards global heterogeneity of the virus. That several SARS-CoV-2 strains at the same geographic location originated from different viral lineages also supports this. Taken together, these findings suggest that the root ancestry of the global genomes differs, with different genome-level adaptation to the host. This research may provide new insights into the codon patterns, host adaptation, and global heterogeneity of SARS-CoV-2.


Subject(s)
COVID-19/virology , Codon Usage , Genome, Viral , SARS-CoV-2/genetics , Evolution, Molecular , Humans , Mutation , Phylogeny
13.
Genomics ; 113(4): 2177-2188, 2021 07.
Article in English | MEDLINE | ID: covidwho-1233643

ABSTRACT

The prevailing COVID-19 pandemic has drawn the attention of the scientific community to study the evolutionary origin of Severe Acute Respiratory Syndrome Corona Virus 2 (SARS-CoV-2). This study is a comprehensive quantitative analysis of the protein-coding sequences of seven human coronaviruses (HCoVs) to decipher the nucleotide sequence variability and codon usage patterns. It is essential to understand the survival ability of the viruses, their adaptation to hosts, and their evolution. The current analysis revealed a high abundance of the relative dinucleotide (odds ratio), GC and CT pairs in the first and last two codon positions, respectively, as well as a low abundance of the CG pair in the last two positions of the codon, which might be related to the evolution of the viruses. A remarkable level of variability of GC content in the third position of the codon among the seven coronaviruses was observed. Codons with high RSCU values are primarily from the aliphatic and hydroxyl amino acid groups, and codons with low RSCU values belong to the aliphatic, cyclic, positively charged, and sulfur-containing amino acid groups. In order to elucidate the evolutionary processes of the seven coronaviruses, a phylogenetic tree (dendrogram) was constructed based on the RSCU scores of the codons. The severe and mild categories CoVs were positioned in different clades. A comparative phylogenetic study with other coronaviruses depicted that SARS-CoV-2 is close to the CoV isolated from pangolins (Manis javanica, Pangolin-CoV) and cats (Felis catus, SARS(r)-CoV). Further analysis of the effective number of codon (ENC) usage bias showed a relatively higher bias for SARS-CoV and MERS-CoV compared to SARS-CoV-2. The ENC plot against GC3 suggested that the mutational bias might have a role in determining the codon usage variation among candidate viruses. 
A codon adaptability study on a few human host parasites (from different kingdoms), including CoVs, showed a diverse adaptability pattern. SARS-CoV-2 and SARS-CoV exhibit relatively lower but similar codon adaptability compared to MERS-CoV.


Subject(s)
COVID-19/genetics , Codon Usage/genetics , Evolution, Molecular , SARS-CoV-2/genetics , Base Composition/genetics , COVID-19/virology , Codon/genetics , Computational Biology , Genome, Viral/genetics , Humans , Nucleotides/genetics , Pandemics , SARS-CoV-2/pathogenicity
14.
J Mol Evol ; 89(6): 341-356, 2021 07.
Article in English | MEDLINE | ID: covidwho-1227833

ABSTRACT

Severe Acute Respiratory Syndrome Coronavirus-2 is a zoonotic virus with a possible origin in bats and potential transmission to humans through an intermediate host. When zoonotic viruses jump to a new host, they undergo both mutational and natural selective pressures that result in non-synonymous and synonymous adaptive changes, necessary for efficient replication and rapid spread of diseases in new host species. The nucleotide composition and codon usage pattern of SARS-CoV-2 indicate the presence of a highly conserved, gene-specific codon usage bias. The codon usage pattern of SARS-CoV-2 is mostly antagonistic to human and bat codon usage. SARS-CoV-2 codon usage bias is mainly shaped by the natural selection, while mutational pressure plays a minor role. The time-series analysis of SARS-CoV-2 genome indicates that the virus is slowly evolving. Virus isolates from later stages of the outbreak have more biased codon usage and nucleotide composition than virus isolates from early stages of the outbreak.


Subject(s)
COVID-19/epidemiology , COVID-19/virology , Codon Usage/genetics , Evolution, Molecular , Host-Pathogen Interactions/genetics , SARS-CoV-2/genetics , SARS-CoV-2/physiology , Adaptation, Physiological/genetics , Animals , COVID-19/transmission , Chiroptera/genetics , Genome, Viral/genetics , Humans , Mutation , Pandemics , Principal Component Analysis , Selection, Genetic/genetics , Time Factors , Virus Replication
15.
J Biomed Inform ; 118: 103801, 2021 06.
Article in English | MEDLINE | ID: covidwho-1219153

ABSTRACT

Understanding the molecular mechanism of COVID-19 pathogenesis helps in the rapid therapeutic target identification. Usually, viral protein targets host proteins in an organized fashion. The expression of any viral gene depends mostly on the host translational machinery. Recent studies report the great significance of codon usage biases in establishing host-viral protein-protein interactions (PPI). Exploring the codon usage patterns between a pair of co-evolved host and viral proteins may present novel insight into the host-viral protein interactomes during disease pathogenesis. Leveraging the similarity in codon usage patterns, we propose a computational scheme to recreate the host-viral protein-protein interaction network. We use host proteins from seventeen (17) essential signaling pathways for our current work towards understanding the possible targeting mechanism of SARS-CoV-2 proteins. We infer both negatively and positively interacting edges in the network. Further, extensive analysis is performed to understand the host PPI network topologically and the attacking behavior of the viral proteins. Our study reveals that viral proteins mostly utilize codons, rare in the targeted host proteins (negatively correlated interaction). Among them, non-structural proteins, NSP3 and structural protein, Spike (S), are the most influential proteins in interacting with multiple host proteins. While ranking the most affected pathways, MAPK pathways observe to be the worst affected during the SARS-CoV-2 infection. Several proteins participating in multiple pathways are highly central in host PPI and mostly targeted by multiple viral proteins. We observe many potential targets (host proteins) from the affected pathways associated with the various drug molecules, including Arsenic trioxide, Dexamethasone, Hydroxychloroquine, Ritonavir, and Interferon beta, which are either under clinical trial or in use during COVID-19.


Subject(s)
COVID-19 , Codon Usage , Host-Pathogen Interactions , Protein Interaction Maps , Signal Transduction , COVID-19/diagnosis , COVID-19/therapy , Humans
16.
FEBS J ; 288(17): 5201-5223, 2021 09.
Article in English | MEDLINE | ID: covidwho-1146926

ABSTRACT

Circulating animal coronaviruses occasionally infect humans. SARS-CoV-2 is responsible for the current worldwide outbreak of COVID-19 that has resulted in 2 112 844 deaths as of late January 2021. We compared genetic code preferences in 496 viruses, including 34 coronaviruses and 242 corresponding hosts, to uncover patterns that distinguish single- and 'promiscuous' multiple-host-infecting viruses. Based on a codon usage preference score, promiscuous viruses were shown to significantly employ nonoptimal codons, namely codons that involve 'wobble' binding to anticodons, as compared to single-host viruses. The codon adaptation index (CAI) and the effective number of codons (ENC) were calculated for all viruses and hosts. Promiscuous viruses were less adapted to their hosts than single-host viruses (P-value = 4.392e-11). All coronaviruses exploit nonoptimal codons to infect multiple hosts. We found that nonoptimal codon preferences at the beginning of viral coding sequences enhance the translational efficiency of viral proteins within the host. Finally, coronaviruses lack endogenous RNA degradation motifs to a significant degree, thereby increasing viral mRNA burden and infection load. To conclude, we found that promiscuously infecting coronaviruses prefer nonoptimal codon usage to remove degradation motifs from their RNAs and to dramatically increase their viral RNA production rates.
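
The codon adaptation index is the geometric mean of per-codon relative adaptiveness values, each defined as a codon's count in a reference gene set divided by the count of the most used synonymous codon. A minimal Python sketch follows. It is not the study's implementation: the function names, the toy reference, and the pseudocount of 0.5 for unobserved codons are assumptions of this example.

```python
import math
from collections import Counter
from itertools import product

# Standard genetic code, codons enumerated in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}
FAMILIES = {}
for _codon, _aa in CODON_TABLE.items():
    FAMILIES.setdefault(_aa, []).append(_codon)


def split_codons(cds):
    seq = cds.upper().replace("U", "T")
    return [seq[i:i + 3] for i in range(0, len(seq) - 2, 3)]


def relative_adaptiveness(reference_cds):
    """Weights w = count / max count within each synonymous family of the reference."""
    counts = Counter(split_codons(reference_cds))
    weights = {}
    for aa, family in FAMILIES.items():
        # Skip stops, single-codon amino acids, and families absent from the reference.
        if aa == "*" or len(family) == 1 or sum(counts[c] for c in family) == 0:
            continue
        # Assumption: unobserved codons get a pseudocount of 0.5 to avoid log(0).
        adjusted = {c: counts[c] or 0.5 for c in family}
        top = max(adjusted.values())
        for c in family:
            weights[c] = adjusted[c] / top
    return weights


def cai(cds, weights):
    """Geometric mean of the weights of all scorable codons in `cds`."""
    logs = [math.log(weights[c]) for c in split_codons(cds) if c in weights]
    return math.exp(sum(logs) / len(logs)) if logs else float("nan")


# Toy reference (hypothetical): GCT three times, GCC once.
weights = relative_adaptiveness("GCTGCTGCTGCC")
print(cai("GCTGCT", weights), cai("GCTGCC", weights))
```

In practice the reference would be a set of highly expressed host genes, so a lower CAI for a virus indicates codon usage further from the host's preferred codons.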


Subject(s)
COVID-19/genetics , Codon Usage/genetics , Evolution, Molecular , SARS-CoV-2/genetics , Animals , COVID-19/virology , Codon/genetics , Computational Biology , Genetic Code/genetics , Genome, Viral/genetics , Humans , Phylogeny , RNA, Messenger/genetics , SARS-CoV-2/pathogenicity , Viral Proteins/genetics
17.
Cell Rep ; 34(11): 108872, 2021 03 16.
Article in English | MEDLINE | ID: covidwho-1135279

ABSTRACT

Viruses need to hijack the translational machinery of the host cell for a productive infection to happen. However, given the dynamic landscape of tRNA pools among tissues, it is unclear whether different viruses infecting different tissues have adapted their codon usage toward their tropism. Here, we collect the coding sequences of 502 human-infecting viruses and determine that tropism explains changes in codon usage. Using the tRNA abundances across 23 human tissues from The Cancer Genome Atlas (TCGA), we build an in silico model of translational efficiency that validates the correspondence of the viral codon usage with the translational machinery of their tropism. For instance, we detect that severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) is specifically adapted to the upper respiratory tract and alveoli. Furthermore, this correspondence is specifically defined in early viral proteins. The observed tissue-specific translational efficiency could be useful for the development of antiviral therapies and vaccines.


Subject(s)
Protein Biosynthesis/genetics , Virus Diseases/genetics , Viruses/genetics , Cell Line , Cell Line, Tumor , Codon Usage/genetics , Genes, Neoplasm/genetics , HCT116 Cells , HEK293 Cells , HeLa Cells , Hep G2 Cells , Humans , Pulmonary Alveoli/virology , RNA, Transfer/genetics , Respiratory Tract Infections/virology , Tropism/genetics , Viral Proteins/genetics , Virus Diseases/virology
18.
Sci Rep ; 11(1): 4108, 2021 02 18.
Article in English | MEDLINE | ID: covidwho-1091453

ABSTRACT

In December 2019, rising pneumonia cases caused by a novel β-coronavirus (SARS-CoV-2) occurred in Wuhan, China; the virus has rapidly spread worldwide, causing thousands of deaths. The WHO declared the SARS-CoV-2 outbreak a public health emergency of international concern, and since then many scientists have dedicated themselves to its study. It has been observed that many human viruses have codon usage biases that match highly expressed proteins in the tissues they infect and depend on the host cell machinery for replication and co-evolution. In this work, we analysed 91 molecular features and codon usage patterns for 339 viral genes and 463 human genes that consisted of 677,873 codon positions. We selected the highly expressed genes from human lung tissue to perform computational studies that allow their molecular features to be compared with those of SARS, SARS-CoV-2 and MERS genes. The integrated analysis of all the features revealed that certain viral genes and overexpressed human genes have similar codon usage patterns. The main pattern was the A/T bias, which together with other features could favour viral infection, enhanced by a host-dependent specialization of the translation machinery of only some of the overexpressed genes. The envelope protein E, the membrane glycoprotein M and ORF7 could benefit further. This could be the key to facilitated translation and viral replication, leading to different comorbidities depending on the genetic variability of the population due to the host translation machinery. This is the first codon usage approach that reveals which human genes could potentially be deregulated due to the codon usage similarities between the host and the viral genes when the virus is already inside the human cells of the lung tissues.
Our work led to the identification of additional highly expressed human genes which are not the usual suspects but might play a role in the viral infection, and lays the basis for further research in the field of human genetics associated with new viral infections. Identifying the genes that could be deregulated under a viral infection is important for predicting collateral effects and determining which individuals would be more susceptible based on their genetic features and associated comorbidities.


Subject(s)
Betacoronavirus/genetics , Coronavirus Infections/genetics , Coronavirus Infections/virology , Codon/genetics , Codon Usage , Computational Biology/methods , Coronavirus/genetics , Coronavirus Infections/metabolism , Genes, Viral , Genome, Viral , Humans , Middle East Respiratory Syndrome Coronavirus/genetics , Phylogeny , Severe acute respiratory syndrome-related coronavirus/genetics , SARS-CoV-2/genetics
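
The A/T bias described in the abstract above is often summarised as the share of codons ending in A or T (U in RNA). The following is a minimal sketch of that one statistic, not the authors' pipeline; the function name `at3_fraction` and the example sequences are illustrative assumptions.

```python
def at3_fraction(cds: str) -> float:
    """Return the fraction of codons whose third base is A or T (U in RNA).

    Hypothetical helper for illustration; assumes `cds` is an in-frame
    coding sequence. Trailing bases that do not fill a codon are ignored.
    """
    seq = cds.upper().replace("U", "T")
    usable = len(seq) - len(seq) % 3
    thirds = [seq[i + 2] for i in range(0, usable, 3)]
    if not thirds:
        raise ValueError("sequence is shorter than one codon")
    return sum(base in "AT" for base in thirds) / len(thirds)


if __name__ == "__main__":
    # Codons ATG, GCA, TTT end in G, A, T, so two of three are A/T-ending.
    print(at3_fraction("ATGGCATTT"))
```

Comparing this value between viral genes and highly expressed host genes gives only a rough first look at shared bias; the study itself combines it with many other molecular features.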
19.
Sci Rep ; 11(1): 3238, 2021 02 05.
Article in English | MEDLINE | ID: covidwho-1065946

ABSTRACT

The rampant spread of COVID-19, an infectious disease caused by SARS-CoV-2, has led to millions of deaths worldwide and devastated social, financial and political entities around the world. Without an existing effective medical therapy, vaccines are urgently needed to stop the spread of this disease. In this study, we propose an in silico deep learning approach for prediction and design of a multi-epitope vaccine (DeepVacPred). By combining in silico immunoinformatics and deep neural network strategies, the DeepVacPred computational framework directly predicts 26 potential vaccine subunits from the available SARS-CoV-2 spike protein sequence. We further use in silico methods to investigate the linear B-cell epitopes, Cytotoxic T Lymphocytes (CTL) epitopes and Helper T Lymphocytes (HTL) epitopes in the 26 subunit candidates and identify the best 11 of them to construct a multi-epitope vaccine for the SARS-CoV-2 virus. The human population coverage, antigenicity, allergenicity, toxicity, physicochemical properties and secondary structure of the designed vaccine are evaluated via state-of-the-art bioinformatic approaches, which indicate that the designed vaccine is of good quality. The 3D structure of the designed vaccine is predicted, refined and validated by in silico tools. Finally, we optimize the codon sequence and insert it into a plasmid to ensure cloning and expression efficiency. In conclusion, the proposed artificial intelligence (AI) based vaccine discovery framework accelerates the vaccine design process and constructs a 694aa multi-epitope vaccine containing 16 B-cell epitopes, 82 CTL epitopes and 89 HTL epitopes, which shows promise against SARS-CoV-2 infection and can be further evaluated in clinical studies. Moreover, we trace the RNA mutations of SARS-CoV-2 and ensure that the designed vaccine can tackle the recent RNA mutations of the virus.


Subject(s)
COVID-19 Vaccines , Deep Learning , SARS-CoV-2/immunology , Spike Glycoprotein, Coronavirus/immunology , Allergens , COVID-19/prevention & control , COVID-19 Vaccines/adverse effects , COVID-19 Vaccines/chemistry , COVID-19 Vaccines/immunology , COVID-19 Vaccines/toxicity , Codon Usage , Computational Biology , Drug Design , Epitopes, B-Lymphocyte/immunology , Epitopes, T-Lymphocyte/immunology , Humans , Immunogenicity, Vaccine , Models, Molecular , Molecular Docking Simulation , Molecular Dynamics Simulation , Mutation , Protein Conformation , RNA, Viral , SARS-CoV-2/chemistry , SARS-CoV-2/genetics , Solubility , Spike Glycoprotein, Coronavirus/chemistry , Spike Glycoprotein, Coronavirus/genetics , T-Lymphocytes, Cytotoxic/immunology , T-Lymphocytes, Helper-Inducer/immunology , Vaccines, Subunit/chemistry , Vaccines, Subunit/immunology
20.
Int J Mol Sci ; 21(21)2020 Oct 24.
Article in English | MEDLINE | ID: covidwho-895369

ABSTRACT

Transmissible gastroenteritis virus (TGEV) is a coronavirus associated with diarrhea and high mortality in piglets. To gain insight into the evolution and adaptation of TGEV, a comprehensive analysis of phylogeny and codon usage bias was performed. Maximum likelihood and Bayesian inference phylogenetic analyses revealed two distinct genotypes, I and II, with genotype I further classified into subtypes Ia and Ib. The compositional properties revealed that the coding sequence contained more A/U nucleotides than G/C nucleotides, and that the synonymous codon third position was A/U-enriched. Principal component analysis based on relative synonymous codon usage (RSCU) values showed genotype-specific codon usage patterns. The effective number of codons (ENC) indicated moderate codon usage bias in the TGEV genome. Dinucleotide analysis showed that CpA and UpG were over-represented and CpG was under-represented in the coding sequence of the TGEV genome. The Parity Rule 2 plot, ENC-plot, and neutrality plot analyses showed that natural selection was the dominant evolutionary driving force in shaping codon usage preference in genotypes Ia and II. In genotype Ib, natural selection also played the major role in driving the codon usage bias, while mutation pressure had a minor role. The codon adaptation index (CAI), relative codon deoptimization index (RCDI), and similarity index (SiD) analyses suggested that genotype I might be more adapted to pigs than genotype II. The current findings contribute to understanding the evolution and adaptation of TGEV.


Subject(s)
Codon Usage , Evolution, Molecular , Transmissible gastroenteritis virus/genetics , CpG Islands , Genome, Viral , Selection, Genetic
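
The relative synonymous codon usage (RSCU) values mentioned in the TGEV abstract above follow a simple definition: the observed count of a codon divided by the count expected if all synonymous codons for that amino acid were used equally. The following is a minimal sketch assuming the standard genetic code and an in-frame coding sequence; the names `CODON_TABLE` and `rscu` are illustrative and are not taken from the study.

```python
from collections import Counter, defaultdict
from itertools import product

# Standard genetic code, built in TCAG order (an assumption: the study may
# have used a dedicated tool rather than a hand-built table).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def rscu(cds: str) -> dict[str, float]:
    """Return RSCU per codon for amino acids that appear in `cds`.

    RSCU = codon count * number of synonymous codons / total count for
    that amino acid. Stop codons and single-codon amino acids (Met, Trp)
    are skipped; codons with ambiguous bases are ignored.
    """
    seq = cds.upper().replace("U", "T")
    usable = len(seq) - len(seq) % 3
    counts = Counter(seq[i:i + 3] for i in range(0, usable, 3))

    families = defaultdict(list)
    for codon, aa in CODON_TABLE.items():
        families[aa].append(codon)

    values = {}
    for aa, codons in families.items():
        if aa == "*" or len(codons) == 1:
            continue
        total = sum(counts[c] for c in codons)
        if total == 0:
            continue  # amino acid absent from this sequence
        for codon in codons:
            values[codon] = counts[codon] * len(codons) / total
    return values


if __name__ == "__main__":
    # Four alanine codons: GCT twice, GCA once, GCC once, GCG never.
    for codon, value in sorted(rscu("GCTGCTGCAGCC").items()):
        print(codon, value)
```

An RSCU above 1 marks a codon used more often than expected under equal usage, and a value below 1 marks one used less often. A matrix of these values across strains is the usual input to the kind of principal component analysis the abstract describes.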